The LOQs varied between years; therefore, to avoid the potential of LOQ serving as a confounding variable, the highest value observed was uniformly applied. Values falling below the LOQ were assigned a zero, yielding conservative estimates of concentrations, and in comparison to assigning LOQ/2, resulting in a marginal rise in the coefficient estimates associated with mud.
Modeling
Fit
A generalized linear model with Poisson-gamma distributed errors and a logarithmic link function was employed to analyze the zero-inflated data censored by the limits of quantification. This model incorporated the predictors Area, TOC, and Mud and was implemented using the ‘glmmTMB’ package in R. Data exploration showed Mud to be collinear with TOC, and stepwise model selection based on Akaike’s information criteria showed Mud to have higher predictive value than TOC. Thus, Area and Mud were retained for the regression analysis. To test for differences between the Areas, pairwise comparisons were performed using the ‘multcomp’ package with single-step p-value adjustment. Simulation-based model validation was carried out using the ‘DHARMa’ and ‘ggeffects’ packages. The code to reproduce the statistical analysis is provided in an online repository.
Shown are the model fits on the original scale, with observations shown by black dots. Small red dots shows 25 simulations from the model for each combination of observed covariate values.
Plots above show scaled quantile residuals using the DHARMa package. These are obtained by simulating from the fitted model and comparing to the observed residuals. The DHaRMa residuals are expected to be uniformly distributed around 0, red lines parallel and near the dotted grey lines. The quantile-quantile plot are expected to follow the 45-degree line. Plots are grouped by PFAS and Area.
Comments:
Results
Overview and summary
Code
map_Area_conc <-ggmap(tar_read(stadia_map)) +geom_point(aes(x = longitude, y = latitude, color = Σ9PFAS, fill = Σ9PFAS, shape = Area), data = pfas_data_eq0_wide, position =position_dodge(width =10), size =2, alpha =0.7) +labs(x ="Longitude [°E]", y ="Latitude [°N]") +scale_color_viridis(option ="magma") +scale_fill_viridis(option ="magma") +scale_shape_manual(values =c(21, 22, 23, 24, 25)) +#change legend label to Area and \u03A39PFAS:labs(color ="\u03A39PFAS", shape ="Area")+guides(fill =FALSE)map_Area_conc
Code
## year# ggmap(tar_read(stadia_map)) +# geom_point(aes(x = longitude, y = latitude, color = year, fill = year, shape = Area), data = pfas_data_eq0_wide, position = position_dodge(width = 10), size = 2, alpha = 0.7) +# labs(x = "Longitude [°E]", y = "Latitude [°N]") +# scale_color_viridis(option = "magma") +# scale_fill_viridis(option = "magma") +# scale_shape_manual(values = c(21, 22, 23, 24, 25)) +# #change legend label to Area and \u03A39PFAS:# labs(color = "Year", shape = "Area")+# guides(fill = FALSE)## depth# ggmap(tar_read(stadia_map)) +# geom_point(aes(x = longitude, y = latitude, color = depth, fill = depth, shape = Area), data = pfas_data_eq0_wide %>% filter(depth < 200), position = position_dodge(width = 10), size = 2, alpha = 0.7) +# labs(x = "Longitude [°E]", y = "Latitude [°N]") +# scale_color_viridis(option = "magma") +# scale_fill_viridis(option = "magma") +# scale_shape_manual(values = c(21, 22, 23, 24, 25)) +# #change legend label to Area and \u03A39PFAS:# labs(color = "Depth [m]", shape = "Area")+# guides(fill = FALSE)
Code
pfas_data_eq0_wide %>%group_by(Area) %>%# mean, sd for each Area for each of Σ9PFAS, TOC, Mud:summarise(across(Mud:Σ9PFAS, mean, na.rm =TRUE),n =n() ) %>%gt() %>%fmt_number(columns =2:13, decimals =1) %>%tab_spanner(label ="Mean", columns =2:13) %>%tab_header(title ="Summary of covariates by Area, <LOQ = 0") %>%cols_label(Area =md("**Area**"),n =md("**n**"), Σ9PFAS =md("**\u03A39PFAS**"),PFHxDA =md("**PFHxDA**"),PFHpDA =md("**PFHpdA**"),PFDA =md("**PFDA**"),PFOS =md("**PFOS**"),PFOA =md("**PFOA**"),PFNA =md("**PFNA**"),PFTrDA =md("**PFTrDA**"),PFDoDA =md("**PFDoDA**"),PFUnDA =md("**PFUnDA**"),TOC =md("**TOC**"),Mud =md("**Mud**") )
Figure 1: Relative contribution of Mud to the concentration of different PFASes. 1.01 corresponds to 1 % increase of PFAS concentration per % increase in Mud.
Figure 2: Residuals of the model for PFAS concentration vs Mud plotted against latitude. The red line is the fitted loess curve. The residuals are not independent of latitude, indicating that the model does not capture all the variation in the data. We see that this is driven primarily by Areas I and V.
Applying the model with covariates ‘C{PFAS} = Mud + latitude’, yields the significance of covariates shown in table above. Latitude is not a significant covariate for Σ9PFAS. Some trends in the individual PFASes point toward lower concentrations at higher latitudes for PFOS, PFUnA, and PFDcA, whereas PFOA may increase going northward. However, as shown in Figure 2, Areas I and V exert leverage on the data and local hotspots serve as hotspots driving much of the observed spatial variance. Thus, differences (and lack of thereof) may be driven largely by locations I and V. We also see that the variation is bigger for Areas I and V, and correspondingly, for low and high latitudes. This makes inference on the background levels difficult.